Abstract

Despite their diverse origin, networks of large real-world systems reveal a number of 
common properties including small-world phenomena, scale-free degree distributions and 
modularity. Recently, network self-similarity as a natural outcome of the evolution of 
real-world systems has also attracted much attention within the physics literature. Here 
we investigate the scaling of density in complex networks under two classical box-covering
renormalizations (network coarse-graining) and under different community-based
renormalizations. An analysis of over 50 real-world networks reveals a power-law scaling
of network density with size under an adequate renormalization technique, irrespective
of network type and origin. The results thus advance the recent discovery of a universal
scaling of density among different real-world networks [Laurienti et al., Physica A
390 (20) (2011) 3608-3613] and imply the existence of a scale-free density also within
complex real-world networks, that is, among their different self-similar scales. The latter
further improves the comprehension of self-similar structure in large real-world networks,
with several possible applications.
1. Introduction

The study of complex real-world networks and their underlying systems has expanded rapidly in
recent years across various fields of science. Owing to their simple and intelligible form, networks
can represent diverse systems of complex interactions and allow them to be investigated in a
common framework. Several fundamental properties of large real-world networks
have thus been revealed in the past decade. These include small-world phenomena [1],
scale-free degree distributions [2, 3], network clustering [1, 4] and robustness [5, 6], degree
mixing [7, 8], community and hierarchical structure [9, 10], network motifs [11] and
others [12] (for reviews see [13, 14]). More recently, network self-similarity as an inherent
property behind the evolution of real-world systems has also attracted much attention
within the physics community [15, 16, 17, 18, 19, 20].
Figure 1: Network renormalization (system coarse-graining) technique [16, 25] applied to a small
example network. At each step, the network is covered with boxes that are replaced by super-nodes. The
latter are linked when a corresponding link also exists in the (original) network. The process then
repeats until only a single node remains, or multiple nodes in the case of a disconnected network. (Here
the network is randomly tiled with boxes of nodes at a distance smaller than 2.)

Network self-similarity is commonly considered alongside the concept of fractal
networks [13 HI]. Fractality is the property of a geometric object of being exactly or
approximately similar to a part of itself [22]. The classical theory of self-similarity, however,
requires a power-law scaling between the system size and its parts under some
renormalization [23, 24]. The latter is an iterative process in which a system is coarse-grained
into smaller replicas such that its essential structural features are preserved [THlllS] (Fig. 1).
Hence, the terms fractal and self-similar network commonly refer only to a self-similar scaling
exponent in the aforementioned power-law relation [16, 26, 18, 27]. However, network
self-similarity is also investigated in the context of other network properties [15, 17, 18, 19, 28]
under various renormalization techniques [TBI E51 150]. (Note that fractal scaling laws
observed in real-world networks do not necessarily imply a self-similar network [31].)

Guimerà et al. [15] were the first to observe self-similar community size distributions in a
network of human communications. Furthermore, Song et al. [16, 26] proposed an
adequate renormalization technique (Fig. 1) to expose the origin of self-similar fractal
scaling in web, collaboration and different biological networks. The latter in fact gives rise
to degree disassortativity [IS] and resilience to diseases [35], commonly observed for these
networks. Still, such scaling cannot coexist with a small-world network topology [53', "31].
Self-similarity has also been considered as a scale-invariance of the degree distribution [18, 27]
or maximum degree [191135] under network renormalization, while Itzkovitz et al. [17]
revealed self-dissimilarity in the motif structure of different biological and technological
networks. Authors have also considered network self-similarity in the context of different
dynamical processes including percolation [3B] and synchronization [37].

Despite the above efforts, there is still little evidence as to whether self-similarity exists
only in certain networks and which properties are indeed invariant across different
network scales. Here we therefore investigate the scaling of density (defined as the ratio of the
number of links to the number of all possible links) with respect to network size under five
renormalization techniques borrowed from the field of fractal networks [THl [2S] and the
community detection literature [38, 39]. The analysis of over 50 real-world networks of diverse
origin reveals a self-similar power-law scaling of network density with size (under suitable
renormalization). This advances the recent work of Laurienti et al. [40], who observed a universal
scaling of density among different real-world networks, while Leskovec et al. [41, 42]
also found similar densification laws in evolving networks. The results thus imply
the existence of a scale-free density not only among complex networks but also within them,
that is, among their different self-similar scales, irrespective of their type and the underlying
domain. Hence, under adequate renormalization, self-similar real-world networks get neither
denser nor sparser with respect to their size, while the characteristic network topology
is also largely retained throughout the renormalization.

The rest of the paper is structured as follows. Section 2 introduces the different
renormalization techniques and real-world network data adopted in the research. An empirical
analysis with a formal discussion of real-world and random networks is presented in
Section 3, while Section 4 gives final conclusions and discusses future work.

2. Techniques and network data 

Self-similarity is primarily studied under the framework of network
renormalization [16, 18]. As already discussed, renormalization is an iterative coarse-graining
technique in which the original network is covered with boxes such that each node belongs to
exactly one box [mils] (Fig. 1). Boxes are then replaced by super-nodes that are linked when a
corresponding link also exists in the (original) network. The entire process repeats until
no links remain and the number of nodes equals the number of connected components.
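
A single coarse-graining step, as described above, can be sketched as follows. This is only a minimal illustration and not the authors' implementation: the names `coarse_grain` and `density`, the edge-list representation and the `box_of` mapping (node to box label) are assumptions made for the example.

```python
def coarse_grain(edges, box_of):
    """Replace every box by a super-node and link two super-nodes whenever
    at least one link joins their boxes in the finer network."""
    super_nodes = set(box_of.values())
    super_links = set()
    for u, v in edges:
        a, b = box_of[u], box_of[v]
        if a != b:  # links inside a box disappear at the coarser scale
            super_links.add(frozenset((a, b)))
    return super_nodes, super_links


def density(n, m):
    """Number of links divided by the number of all possible links
    in a simple undirected network with n nodes and m links."""
    return 0.0 if n < 2 else 2.0 * m / (n * (n - 1))


# Hypothetical example: two triangles joined by one link, one box per triangle.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
box_of = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
nodes, links = coarse_grain(edges, box_of)
print(len(nodes), len(links), density(len(nodes), len(links)))
```

Repeating the step with a new covering of the super-nodes, until no links remain, yields the sequence of renormalized networks whose sizes and densities are compared in Section 3.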

While there exist a number of different box-covering approaches, not all of them
are able to reveal self-similar scales in complex networks. We therefore employ techniques
that have already proven useful for exposing self-similarity in various real-world
networks [m [TBI nil US]. In particular, we adopt methods commonly used in the analysis of
fractal networks, as well as different community detection algorithms.

Fractal network structure is mainly explored under two general classes of
renormalization techniques, namely node coloring and network burning approaches [16, 29] (for
reviews see [20, 27]). In the former, box-covering is mapped to a node coloring
problem [331 Ell, whereas in the latter, boxes are grown around a randomly selected seed
node. Although there exist several efficient algorithms for node coloring [44, 45],
network burning methods offer some distinct advantages [29]. Different authors have
proposed a wide range of alternative network coarse-graining techniques, including methods
based on connectivity patterns [17], the skeleton of the network [5S], link-covering [IB] and
others [HIlTlEnilSZl.

For the purpose of this research, we adopt two classical network burning approaches.
First, the box-tiling method randomly tiles the network with boxes of nodes that are at a
distance smaller than $l_B$ [TB1[2B] (Fig. 1). Second, the cluster-growing method incrementally
grows boxes from randomly selected seed nodes within a distance not larger than $r_B$ 12^1
H7]. Hence, for random configurations, $l_B = 2 r_B + 1$. The box-tiling method allows
for somewhat easier analytical consideration, whereas the cluster-growing approach enables
a more efficient implementation. For the analysis in Section 3 we set $l_B$ to 3 and $r_B$ to 2
with respect to network small-worlds [1]. Note that the latter extends the definition of an
egonet [JH 135] (a subnetwork induced by a central ego node and its neighbors), which
can be seen as a local signature of the respective node.
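
The cluster-growing idea can be sketched as below. This is a hedged illustration, not the authors' code: the function name `cluster_growing`, the adjacency-dictionary input and the tie-breaking rule (distances are measured in the full network, and a node already claimed by an earlier box is simply skipped) are assumptions of this example.

```python
import random
from collections import deque


def cluster_growing(adj, r_b, seed=None):
    """Cover a network with boxes grown from randomly ordered seed nodes.

    adj maps each node to the set of its neighbours. Every uncovered node
    within distance r_b of a seed joins the seed's box. Returns a dict
    mapping node -> box index.
    """
    rng = random.Random(seed)
    order = sorted(adj)
    rng.shuffle(order)
    box_of = {}
    box = 0
    for s in order:
        if s in box_of:
            continue  # already covered, cannot act as a seed
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search up to distance r_b
            u = queue.popleft()
            if u not in box_of:
                box_of[u] = box
            if dist[u] == r_b:
                continue
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        box += 1
    return box_of


# Hypothetical example: a path of five nodes covered with r_b = 1.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
boxes = cluster_growing(path, r_b=1, seed=7)
print(boxes)
```

Because each node is assigned to exactly one box, the returned mapping can be passed directly to a coarse-graining step that replaces boxes by super-nodes.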

We further adopt several algorithms drawn from the community detection literature (for
reviews see [38, 39]). Here boxes are identified with communities [9] (groups of nodes
densely connected within and only loosely connected between) revealed by the selected
algorithm, whereas the network coarse-graining procedure is otherwise identical to the above.
Community detection has already been successfully employed to reveal self-similarity in
real-world networks [15]. Recent work also implies the existence of community structures on
Table 1: Real-world networks. (n and m correspond to the number of nodes and links, respectively.)

Network                                 Type             n          m
Zachary's karate club [50]              Social           34         78
Lusseau's dolphins [51]                 Social           62         159
Comp. sci. PhD students [52]            Social           1025       1043
Facebook friendships [53]               On-line social   324        2218
Wikipedia who-votes-who [54]            On-line social   7066       100736
Slovenian comp. science [53]            Collaboration    239        568
Krebs's Internet industry [52]          Collaboration    219        630
Complex networks science [55]           Collaboration    379        914
Paul Erdős collaborations [52]          Collaboration    446        1413
Comput. Geometry archive [56]           Collaboration    3621       9461
General Relativity archive [41]         Collaboration    4158       13422
PGP web-of-trust [57]                   Collaboration    10680      24316
Astro Physics archive [42]              Collaboration    17903      196972
US political books [58]                 Co-purchase      105        441
amazon.com domain [59]                  Web graph        2879       3886
epa.gov domain [52]                     Web graph        4253       8897
Broad-topic queries [60]                Web graph        5925       15770
US political blogs [58]                 Web graph        1222       16714
Graph Drawing proceedings [52]          Citation         249        635
Stanley Milgram citations [52]          Citation         233        994
H. Small & B. Griffith citations [52]   Citation         1024       4916
Scientometrics archive [52]             Citation         2678       10368
Teuvo Kohonen citations [52]            Citation         3704       12673
Joshua Lederberg citations [52]         Citation         8212       41430
Ahmed Zewail citations [52]             Citation         6640       54173
High E. Particle Phys. archive [61]     Citation         27400      352021
Mobile phone records [62]               Communication    345        355
Emails at a university [15]             Communication    1133       5451
Emails at Enron [63]                    Communication    33696      180811
Novel David Copperfield [55]            Information      112        425
Roget's Thesaurus dictionary [64]       Information      994        3640
Java documentation (javax) [65]         Information      1031       4408
ODLIS dictionary [66]                   Information      2898       16376
USF association norms [67]              Information      10617      63782
FOLDOC dictionary [68]                  Information      13356      91471
WordNet dictionary [52]                 Information      75606      119564
Small software project [52]             Software         83         125
JUNG graph framework [69]               Software         398        943
Java language (javax) [69]              Software         1570       7194
Java language (general) [52]            Software         1538       7817
Oregon aut. systems [70]                Internet         22963      48436
Gnutella file sharing [42]              Internet         36646      88303
European roads [71]                     Technological    1039       1305
Finite automaton [52]                   Technological    1096       1677
US air lines [52]                       Technological    332        2126
US power grid [52]                      Technological    4941       6594
Escherichia Coli regulatory [52]        Biological       328        456
Caenorhabditis Elegans neural           Biological       297        2148
Yeast protein interactions [72]         Biological       2224       6609
Data modeling [52]                      Other            638        1020

Amazon products [73]                    Co-purchase      524366     1491774
nd.edu domain [74]                      Web graph        325729     1497135
Pennsylvania roads [63]                 Technological    1087562    1541514
Wikipedia talk service [54]             Communication    2388953    4656682
Skitter overlay map [41]                Internet         1694616    11094209

various scales of complex real-world networks [751 ES]. Hence, community detection
appears to be an adequate alternative to classical box-covering renormalization techniques.

For generality, we consider three diverse community detection algorithms. First,
we adopt balanced propagation [7T] as an example of a highly scalable state-of-the-art
algorithm. The approach is based on the label propagation principle of Raghavan et al. [77],
while node balancers are introduced to improve the stability of the algorithm (the stability
parameter is set to 1/4). Next, we employ a fast hierarchical optimization of
modularity $Q$ [78] proposed by Clauset et al. [75], one of the most widely used approaches in
the past literature [38]. However, due to many limitations of the modularity measure $Q$, high
values of $Q$ cannot be regarded as an indication of network community structure [80, 81].
Last, we also consider a spectral algorithm of Newman [55] as a representative of a
partitioning approach with origins in classical graph theory [82]. The algorithm reveals
communities by extracting the leading eigenvector of the network modularity matrix using
a power method.
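
To illustrate the last step, the following sketch bisects a small network by the signs of the leading eigenvector of the modularity matrix, computed with a power method. It is a simplified illustration under stated assumptions and not the algorithm as used in the paper: only a single bisection is performed, the function name `leading_eigenvector_split` is hypothetical, and the spectrum is shifted by a Gershgorin-style bound (an assumption of this example) so that the iteration converges to the most positive eigenvalue.

```python
def leading_eigenvector_split(adj, iterations=500):
    """Split nodes into two groups by the signs of the leading eigenvector
    of the modularity matrix B, where B[i][j] = A[i][j] - k_i * k_j / (2m)."""
    nodes = sorted(adj)
    n = len(nodes)
    k = [len(adj[u]) for u in nodes]
    two_m = sum(k)
    B = [[(1.0 if nodes[j] in adj[nodes[i]] else 0.0) - k[i] * k[j] / two_m
          for j in range(n)] for i in range(n)]
    # Shift the spectrum so that all eigenvalues of B + shift * I are
    # non-negative; the power method then finds the most positive one.
    shift = max(sum(abs(x) for x in row) for row in B)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-uniform start vector
    for _ in range(iterations):
        w = [sum(B[i][j] * v[j] for j in range(n)) + shift * v[i]
             for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    positive = {nodes[i] for i in range(n) if v[i] > 0}
    return positive, set(nodes) - positive


# Hypothetical example: two triangles joined by a single link.
two_triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
                 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(leading_eigenvector_split(two_triangles))
```

The dense matrix keeps the example short; for networks of the sizes in Table 1 a sparse matrix-vector product would be needed.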

The analysis in Section 3 is conducted on 55 real-world networks that are often analyzed
in the complex network literature (Table 1), and also on random graphs à la Erdős-Rényi [83]
and different generative graph models. The real-world networks range from tens of
nodes to tens of millions of links and include different social (classical, on-line,
collaboration etc.), information (web graphs, citation, communication etc.), technological
(Internet, software, transportation etc.), biological (protein, genetic and neural) and other
networks. Due to the large number of networks considered, a detailed description is
omitted. Still, the networks were carefully chosen to represent a relatively diverse set of
real-world systems, including most types of networks commonly analyzed in the
literature. For simplicity, all networks are treated as simple undirected graphs and reduced
to their largest connected components.
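
This preprocessing can be sketched as follows. It is a minimal example under assumptions: the function name `largest_component` and the raw edge-list input are hypothetical, and directed, repeated and self-looping links are handled in the simplest possible way.

```python
from collections import deque


def largest_component(edges):
    """Build a simple undirected graph from a list of links (ignoring
    direction, duplicates and self-loops) and keep only its largest
    connected component. Returns an adjacency dict node -> set of nodes."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # drop self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    best, seen = set(), set()
    for s in adj:
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        while queue:  # breadth-first search of one component
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return {u: adj[u] & best for u in best}


# Hypothetical raw data with a reciprocal link, a self-loop and a small component.
raw = [(1, 2), (2, 1), (2, 3), (3, 3), (4, 5)]
graph = largest_component(raw)
print(graph)
```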

3. Analysis and discussion 

In the following we first analyze the self-similar scaling of density in real-world networks of
moderate size (Section 3.1), while the analysis of Erdős-Rényi random graphs and different
generative graph models is given in Section 3.2. In Section 3.3 we further consider the
self-similarity of five larger real-world networks with at least a million links.

3.1. Real-world networks 

The algorithms were first applied to 50 real-world networks (Table 1). From the
number of nodes $n$ and the density $d$ of the original and reduced networks, we examine
how density scales with network size. In particular, $d$ is expressed as a power
function of $n$ through the formula $d = c \cdot n^{\gamma}$, where $\gamma$ is a scaling exponent and $c$ is a constant.
We measure goodness of fit to the data using the coefficient of determination $R^2$ (how well
the network size predicts density) and the dependence between both variables using
Spearman's correlation coefficient $\rho$ (the extent to which network density decreases
as network size increases). Moreover, we also evaluate the number of self-similar scales $S$,
defined as the number of renormalized networks revealed under the different techniques.
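
The fit and the two statistics can be sketched as follows. The text does not state how the power law is fitted, so this example assumes ordinary least squares on log-transformed values; the function names `fit_power_law` and `spearman` are hypothetical, and the input data are synthetic.

```python
import math


def fit_power_law(sizes, densities):
    """Least-squares fit of log d = log c + gamma * log n.
    Returns (c, gamma, r_squared) of the fit in log-log space."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(d) for d in densities]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    gamma = sxy / sxx
    c = math.exp(my - gamma * mx)
    return c, gamma, sxy * sxy / (sxx * syy)


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of ranks (ties share mean ranks)."""
    def ranks(values):
        order = sorted(range(len(values)), key=values.__getitem__)
        r = [0.0] * len(values)
        i = 0
        while i < len(order):
            j = i
            while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
                j += 1
            for t in range(i, j + 1):
                r[order[t]] = (i + j) / 2 + 1
            i = j + 1
        return r

    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var = sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var)


# Synthetic data following d = 2 / n exactly.
sizes = [10, 100, 1000, 10000]
densities = [2.0 / n for n in sizes]
c, gamma, r2 = fit_power_law(sizes, densities)
rho = spearman(sizes, densities)
print(c, gamma, r2, rho)
```

The sketch does not guard against degenerate input such as identical sizes or identical densities, for which the variances vanish.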

Mean estimates for each method appear in Table 2. Coefficients $R^2$ demonstrate
that the power-law relationship between size and density is a good fit to
the data under box-covering methods and balanced propagation based renormalization.






Table 2: Estimates of the fit for power-law scaling of network density and size in 50 real-world networks
revealed under different renormalization techniques. Values are estimates of the mean over 10
renormalizations of each network and correspond to correlation coefficient ρ, coefficient of determination R²,
expressed network density d and the number of revealed self-similar scales S. (For each technique,
ρ and R² are obtained separately for original and renormalized networks, and for renormalized varieties
only: first and second row, respectively. Bold values of R² indicate relatively high goodness of fit to a
power-law, whereas values in italics show poor performance of the respective renormalization technique.)

Technique                     ρ        R²       d                 S
Randomized box-tiling         -0.975   0.944    1.7 · n^-0.807    5.3
                              -0.973   0.936
Randomized cluster-growing    -0.977   0.948    1.6 · n^-0.818    4.6
                              -0.977   0.944
Balanced propagation          -0.985   0.962    1.9 · n^-0.836    4.3
                              -0.980   0.963
Modularity optimization       -0.966   0.956    3.0 · n^-0.882    3.9
                              -0.889   0.820
Spectral analysis             -0.951   0.922    4.1 · n^-0.893    4.5
                              -0.878   0.718
Original networks             -0.924   0.870    3.8 · n^-0.921


(We can reject the null hypothesis of no actual relationship between the variables at the one
percent significance level, thus the results are statistically significant.) Irrespective of the
renormalization technique, $R^2$ and $\rho$ for original networks improve when their
renormalized varieties are also considered. Otherwise, box-covering methods perform better than
community detection algorithms, whereas balanced propagation exhibits the most homogeneous
relationship between size and density. The spectral algorithm and modularity optimization
perform worst, particularly when fits are observed for renormalized networks only. In the case
of modularity optimization, this could be largely due to its resolution limit [80], the lack of a
global maximum and the degeneracy of optimal partitions [81]. Spectral analysis, on the other
hand, is in fact an optimization of eigenvectors of the modularity matrix. It is therefore
subject to the above-mentioned modularity limitations, and it also reveals
modules in random networks [84].

The plots in Fig. 2 illustrate the size and density relationships, with scaling exponents
$\gamma$ around $-0.85$. Original networks exhibit a greater scaling factor (see also Section 3.3),
which indicates that $\gamma$ approaches $-1$ for adequately large $n$. This corresponds to the
commonly observed finding that most large-scale real-world networks tend to be sparse: the
number of links is not close to $O(n^2)$ but rather of order $O(n)$. Consequently,
we can simplify the density definition to the relationship $d \approx n^{-1}$. Thus, a power-law
relationship between network size and density is expected for original networks (without
considering reduced varieties). However, among renormalized networks the relationships
follow even stronger power-laws. This means that networks obtained on different scales
of the renormalization process also satisfy a power-law relationship between size and density,
and implies the existence of density scaling also within real-world networks.
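
The step from sparsity to the simplified density relationship can be written out explicitly. The following short derivation assumes a simple undirected network with $n$ nodes and $m$ links; the mean degree symbol $\langle k \rangle$ is introduced here for illustration only and does not appear in the original text.

```latex
d = \frac{2m}{n(n-1)} = \frac{\langle k \rangle}{n-1},
\qquad \langle k \rangle = \frac{2m}{n}.
% If m = O(n), the mean degree stays bounded as n grows, hence
d \approx \langle k \rangle \, n^{-1} \quad (n \gg 1),
\qquad \text{so that } \gamma \to -1 .
```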
Figure 2: Power-law scaling of network density and size in 50 real-world networks of diverse origin
revealed with different renormalization techniques. Plots show scaling of density for a single renormal- 
ization of each network under respective technique. (Green triangles correspond to original networks, 
whereas blue circles represent their renormalized varieties. Symbol sizes are proportional to the number 
of networks with the same size and density.) 
Furthermore, the results show similar behavior of the exponents $\gamma$ and constants $c$ for the
better performing techniques, namely box-tiling, cluster-growing and balanced propagation.
This finding implies that box-covering methods find smaller and sparser boxes, similar
to the communities detected with balanced propagation. The other two algorithms reveal
bigger, denser and also more heterogeneous communities with respect to density scaling. The
values of self-similar scales $S$ are in accordance with these observations. Modularity
optimization reaches a network with a single community in the smallest number of scales on
average (bigger communities). On the other hand, box-tiling obtains a larger number of reduced
networks (smaller boxes), which is expected given the setting of the distance $l_B$.

To summarize, the analysis of real-world networks reveals a power-law scaling of
network density with respect to network size. Among the employed renormalization
techniques, balanced propagation appears to give the best reduction of networks
according to the density scaling. The results acquired by the three best performing techniques
indicate the existence of a certain common organizing principle of networks, which dictates
the linking rules and interactions among nodes. Our findings thus advance the recent
discovery of a universal scaling of density among real-world networks [40], since we reveal
density scaling also among different self-similar scales of complex real-world networks. In
addition, the results are consistent with the densification laws of Leskovec et al. [41, 42],
$m \propto n^{\alpha}$, where $\alpha$ ranges between 1 and 2 and relates to our exponent $\gamma$, which lies
between $-1$ and $0$, respectively. Thus, our study extends densification laws to other
dimensions of network structure.
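
The relation between the densification exponent and the density scaling exponent can be made explicit. The following is a sketch that assumes a simple undirected network and large $n$; it is not taken from the original text.

```latex
m \propto n^{\alpha}
\;\Longrightarrow\;
d = \frac{2m}{n(n-1)} \propto \frac{n^{\alpha}}{n^{2}} = n^{\alpha - 2}
\;\Longrightarrow\;
\gamma = \alpha - 2 .
% alpha = 1 (constant mean degree) gives gamma = -1;
% alpha = 2 (dense networks) gives gamma = 0.
```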

Besides density, we also studied the scaling of other network properties with respect
to network size. In particular, we analyzed the number of links, average and maximum
degree, number of articulation points, average path length and diameter [1], betweenness and
closeness centrality [85] and clustering coefficient [511. The results also reveal significant
scaling between network size and average node or link betweenness (the number of
shortest paths going through a node or link, respectively). Given the definition of
network density and the observed power-law relationship between size and density, a similar
relationship for the number of links is expected. However, for simplicity, a detailed
investigation of betweenness centrality scaling is omitted, although it is a promising direction
for future research.



3.2. Random networks 

To further validate our results we apply box-tiling and modularity optimization
renormalizations to Erdős-Rényi random graphs with different sizes $n$ and probabilities $p$ of
linking nodes. We generate networks with 500, 1000, 2500, 5000 and 10000 nodes
and probabilities corresponding to the density obtained with modularity optimization based
renormalization (Section 3.1), the density reported in [40], and a probability that should assure
a sufficient size of the largest network component [55].
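
Generating such graphs can be sketched as below. This is a minimal illustration and not the authors' generator: the function name `erdos_renyi` is hypothetical, the quadratic loop over node pairs is chosen for clarity rather than speed, and the setting $p = 2/(n - 1)$ is the one listed in Table 3.

```python
import random


def erdos_renyi(n, p, seed=None):
    """Return the links of a G(n, p) random graph: each of the n(n-1)/2
    node pairs is linked independently with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u in range(n) for v in range(u + 1, n)
            if rng.random() < p]


# Example with the smallest size used in the text; the expected density equals p.
n = 500
p = 2 / (n - 1)
links = erdos_renyi(n, p, seed=42)
print(len(links), 2 * len(links) / (n * (n - 1)), p)
```

For the larger sizes a generator that skips over absent pairs would be preferable, since the loop above visits every pair of nodes.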

First, we test balanced propagation renormalization, since the method performs
best on real-world networks. The results appear very good, showing fits close to
ideal ($R^2$ and $\rho$ close to 1 and $-1$, respectively). However, a detailed investigation shows
that for most of the generated networks the renormalization reveals only a single scale or
concludes without reduction, since random networks supposedly have no community structure. For
this reason we exclude balanced propagation from the analysis. Thus, we study box-tiling
as an illustration of the classical box-covering principle and modularity optimization as an
Table 3: Estimates of the fit for power-law scaling of network density and size in Erdős-Rényi
random graphs obtained with two renormalization techniques. For each probability $p$ of a link between
two nodes, we construct an ensemble of networks of various sizes. Values are estimates of the mean over
10 realizations of each random graph. (See also Table 2.)

p    Technique    ρ    R²    d    S

Randomized box-tiling n'oco n'^f^! 2.8 • n~° ''^^ 4.9 

Modularity optimization ll'^ol ^ '^^f^ 12.3 • ri~^ '^^^ 3.0 

—0.583 0.440 

Randomized box-tiling n'ooo n'^lf 3.7 • n^^'^^*" 4.5 

7 9.^-0.986 -0.8«2 U./til 

iv/r ^ 1 • -0.964 0.998 ^ _i 022 on 

Modularity optmiization ^ ^ 10.5 • n ' 3.0 

Randomized box-tiling 2.6 • n^"-^^"^ 6.7 

nit ^\ — 0.98o 0.962 

2/(n- 1) 

iv^ J , • -0.930 0.916 ^ , _i 

Modularity optmiization g 744 817 ^.4 • n ""'^ 4.0 



example of community-based renormalization. Note that, in contrast to the above, the
latter reveals non-trivial modules also in random networks (Section 3.1).

The results appear in Table 3. A strong relationship ($R^2 = 1$, $\rho = -1$) arises between
the size and density of original networks, which is due to the settings of the probability $p$.
These strong fits also cause high values for original and renormalized networks together.
The results for renormalized varieties of networks show poor fits to the data and imply
a rather diverse density of reduced networks with respect to their size. This is anticipated
owing to the random network structure. However, the values of $R^2$ and $\rho$ for renormalized
networks under the $p = 2/(n - 1)$ setting are relatively high. A close examination of the plot
for box-tiling shows diverse density among reduced networks; however, this diversity is
smoothed out by the large number of reduction scales. On the other hand, networks reduced under
modularity optimization reveal almost the same density on each scale, and thus lead to
a higher fit. The slightly greater values for renormalized networks under box-tiling seem to
occur due to the definition of boxes, which considers only proximity among nodes.

Other variables, including the scaling exponent $\gamma$, constant $c$ and the number of revealed
self-similar scales $S$, span a greater range than the values for real-world networks. This
confirms that there exists no characteristic density for random networks and indicates that
random networks do not exhibit a common power-law density scaling.

According to the above, we conclude that the results for random networks are
weak, as anticipated, since random networks should not reveal structures like the communities
found in real-world networks. Conversely, the findings indicate that the self-similar density
scaling of real-world networks is not obtained by chance, and that the scaling exists due to
some inner principles which determine network structure.

We have also analyzed several generative graph models to determine whether they reveal similar
Table 4: Estimates of the fit for power-law scaling of network density and size in five large real-world
networks revealed with balanced propagation. Values are estimates of the mean over 10 renormalizations
of each network. (See also Table 2.)

Technique               ρ        R²       d                 S
Balanced propagation    -0.990   0.977    2.9 · n^-0.926    4.9
                        -0.980   0.961
Original networks       -0.900   0.719    66.2


scaling of density as observed in real-world networks. As expected, under balanced
propagation renormalization, classical scale-free [21 [87] and small-world graph [88] models show
the same behavior as random graphs (due to the lack of community
structure). On the other hand, the forest fire model proposed by Leskovec et al. [41, 42] reveals
strong power-law scaling of density with estimates similar to those in Table 2
(exact results are omitted). Interestingly, the model gives networks that also exhibit
network densification laws, shrinking diameters, community structure and scale-free degree
distributions [42], and thus provides a relatively realistic structure of real-world networks.
Note that the community guided attachment model [321 |3T], which also follows densification
laws, does not reveal self-similar scaling of density; thus, the latter is indeed not an
artifact of the former, but rather extends network density laws to other dimensions.

3.3. Large real-world networks 

For a complete analysis, we also analyze the size and density relationship of the
five largest real-world networks presented in Table 1. These are a co-purchase network
of different products from Amazon in 2006, a complete map of the nd.edu domain, the road network
of Pennsylvania, a communication network of user discussions on Wikipedia before January
2008, and an Internet topology graph from traceroutes in 2005. For simplicity, we present
the study only for the best performing balanced propagation based renormalization, where
the maximum number of iterations is limited to 100.

The results are presented in Table 4. Considering only original networks, the fits are
expectedly low due to the small number of networks considered. For the same reason the
constant $c$ and exponent $\gamma$ also differ from the ones in Section 3.1. However, the other results
show a very good fit, particularly for original and renormalized networks together, and reveal
a power-law relationship between network size and density (see Fig. 3). (Again, the results are
statistically significant at the one percent significance level.) As expected due to the size of
the networks, the scaling exponent is close to $-1$. The number of self-similar scales is higher
than in the analysis in Section 3.1, since the networks are larger and are thus reduced in more
steps. On the other hand, $S$ does not increase significantly with network size, which implies
that renormalization is an effective and efficient approach for simplifying large networks.

Fig. 3 illustrates renormalized varieties of three large networks. We consider networks
of diverse origin to assess how different network structure affects the relationship
between size and density. For instance, the Pennsylvania roads network shows a very
homogeneous structure while, on the contrary, the other two networks present a core-periphery
structure typical for social and information networks. However, these diverse network
Figure 3: (left) Power-law scaling of network density and size in five real-world networks with millions
of links revealed with balanced propagation. Plot shows scaling of density over 10 renormalizations of
each network. (Green triangles correspond to original networks, whereas blue circles represent their
renormalized varieties. Symbol sizes are proportional to the number of networks with the same size and
density.) (right) Density of network structure in renormalized varieties of three large real-world systems
of different origin. (Node symbols correspond to degree-corrected clustering coefficient [4] that ranges
between 0 and 1, shown as green triangles and blue circles, respectively, while symbol sizes are
proportional to the number of nodes in the original network.)

structures are not reflected in the results (Fig. 3, left). Thus, the finding confirms a common
density scaling in real-world networks irrespective of network type and origin.

Our study improves the comprehension of self-similar structure in real-world networks
and implies several possible applications. First, adequate network coarse-graining
allows simplification and abstraction of large real-world networks without losing
information about the original network density. Reduction also enables visualization and improves
the comprehension of larger complex networks. Additionally, self-similar density scaling
can help to determine a sufficient density for the size of the sub-graphs in graph
sampling applications (e.g., [55]), and to improve the accuracy of link prediction (for a review
see (50]) and the quality of synthetic graph generation (e.g., [H]).

4. Conclusions 

The paper explores the relationship between the size and density of complex real-world
networks under different box-covering and community-based renormalization techniques.
The analysis was conducted on over 50 real-world networks of various sizes as well as on
Erdős-Rényi random graphs and different generative graph models. The main contribution
of the study is to imply the existence of a scale-free density not only among different
real-world networks, but also among their self-similar scales. A common scaling of
density thus appears to be a distinctive property of complex real-world networks irrespective of
their type, size and origin. The results also reveal balanced propagation based
renormalization as the best performing method among the observed algorithms. The study
on Erdős-Rényi random graphs, which supposedly exhibit no community structure,
validates the above results and confirms that the observed scaling of density is distinctive for
real-world networks. Hence, our findings extend recent discoveries to other dimensions
of network structure and further improve the comprehension of self-similarity in
complex real-world networks. The latter has possible applications in graph sampling, link
prediction, synthetic graph generation, network abstraction and visualization.

In our future work we intend to focus on other possible characteristics of density
scaling that could be identified in networks of common type and origin. Furthermore,
we will analyze the scaling of betweenness centrality with respect to network size in detail.
Moreover, the work will be extended to finding suitable ways of abstracting large
real-world networks while preserving their fundamental properties.